STAT 456: Introduction to Statistical Theory
Lecture 2: Sampling Distributions — \(\bar{X}\) , \(S^2\) , \(\chi^2\) , \(t\) , \(F\)
2026-01-15
Sampling from the Normal Distribution
Today’s Topics:
The derived distributions: \(\chi^2\) , \(t\) , \(F\)
Properties of sample mean \(\bar{X}\) and sample variance \(S^2\)
The surprising independence of \(\bar{X}\) and \(S^2\) under normality
This is one of the most important lectures in the course. The independence result (Theorem 5.3.1) is foundational for all inference procedures.
Setup — Random Sample Notation
Definition (Random Sample)
\(X_1, \ldots, X_n\) are a random sample from population with pdf \(f\) if they are iid with common pdf \(f\) .
Sample statistics: \[\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i, \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\]
Emphasize the \(n-1\) in the denominator of \(S^2\) . This is the “Bessel correction” that makes \(S^2\) unbiased for \(\sigma^2\) .
Properties of \(\bar{X}\) and \(S^2\) — General Case
Theorem (No normality assumed)
Let \(X_1, \ldots, X_n\) be a random sample from a population with mean \(\mu\) and variance \(\sigma^2 < \infty\) . Then:
(a) \(E[\bar{X}] = \mu\)
(b) \(\text{Var}(\bar{X}) = \sigma^2/n\)
(c) \(E[S^2] = \sigma^2\)
Note: Part (c) explains the \(n-1\) denominator in \(S^2\) — it makes \(S^2\) an unbiased estimator of \(\sigma^2\) .
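Part (c) is easy to check by simulation. A minimal NumPy sketch (the sample size, variance, and replication count here are arbitrary choices):

```python
import numpy as np

# Monte Carlo check of E[S^2] = sigma^2: the n-1 denominator is unbiased,
# while dividing by n underestimates sigma^2 by a factor (n-1)/n.
rng = np.random.default_rng(0)
n, sigma2, reps = 5, 4.0, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

s2_unbiased = x.var(axis=1, ddof=1)  # divides by n - 1
s2_biased = x.var(axis=1, ddof=0)    # divides by n

print(s2_unbiased.mean())  # close to sigma^2 = 4
print(s2_biased.mean())    # close to (n-1)/n * sigma^2 = 3.2
```

The `ddof=0` average settles near \((n-1)\sigma^2/n\), which is exactly the bias the Bessel correction removes.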
Chi-Squared Distribution — Recall
The chi-squared distribution with \(p\) degrees of freedom, denoted \(\chi^2_p\) , has pdf: \[f(x) = \frac{1}{\Gamma(p/2) 2^{p/2}} x^{(p/2)-1} e^{-x/2}, \quad x > 0\]
This is the Gamma\((p/2, 2)\) distribution (shape \(p/2\), scale \(2\)).
Lemma (Key Facts about Chi-Squared)
(a) If \(Z \sim N(0,1)\) , then \(Z^2 \sim \chi^2_1\)
(b) If \(X_1, \ldots, X_k\) independent with \(X_i \sim \chi^2_{p_i}\) , then \(\sum X_i \sim \chi^2_{\sum p_i}\)
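Both facts are easy to confirm numerically; a quick sketch (NumPy assumed, constants arbitrary):

```python
import numpy as np

# Squaring standard normals gives chi^2_1 variables; summing three
# independent ones should give chi^2_3, with mean 3 and variance 2*3 = 6.
rng = np.random.default_rng(1)
z = rng.normal(size=(500_000, 3))
q = (z**2).sum(axis=1)

print(q.mean())  # close to 3
print(q.var())   # close to 2 * 3 = 6
```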
MAIN THEOREM — Properties under Normality
Theorem (Independence and Distributions under Normality)
Let \(X_1, \ldots, X_n\) be a random sample from \(N(\mu, \sigma^2)\) . Then:
(a) \(\bar{X}\) and \(S^2\) are independent random variables
(b) \(\bar{X} \sim N(\mu, \sigma^2/n)\)
(c) \((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\)
Key observations:
Part (a) is surprising! The mean and the spread of a normal sample are independent.
Part (b) follows from properties of normal (sum of normals is normal)
Part (c) requires proof — we’ll do this in detail
PROOF of Independence (Part a)
Goal: Show \(\bar{X}\) and \(S^2\) are independent.
Strategy: Show \(\bar{X}\) and \((X_2 - \bar{X}, \ldots, X_n - \bar{X})\) are independent, then note \(S^2\) is a function of the latter.
Step 1: Write \(S^2\) in terms of deviations
\[S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2 = \frac{1}{n-1}\sum_{i=2}^n (X_i - \bar{X})^2 + \frac{1}{n-1}(X_1 - \bar{X})^2\]
Since \(\sum_{i=1}^n (X_i - \bar{X}) = 0\) , we have: \[X_1 - \bar{X} = -\sum_{i=2}^n (X_i - \bar{X})\]
So \(S^2\) is a function of \((X_2 - \bar{X}, \ldots, X_n - \bar{X})\) only.
Step 2: Transform to show independence
WLOG assume \(\mu = 0\) , \(\sigma = 1\) . Joint pdf of \((X_1, \ldots, X_n)\) : \[f(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2}\sum_{i=1}^n x_i^2\right)\]
Step 3: Make the transformation
\[y_1 = \bar{x}, \quad y_i = x_i - \bar{x} \text{ for } i = 2, \ldots, n\]
This is linear with Jacobian \(\partial y / \partial x = 1/n\); equivalently, the inverse transformation has Jacobian \(n\), which is the factor that appears in the transformed density.
Inverse transformation: \[x_1 = y_1 - \sum_{i=2}^n y_i, \quad x_i = y_i + y_1 \text{ for } i \geq 2\]
Step 4: Compute joint density of \((Y_1, \ldots, Y_n)\)
After substitution: \[\sum_{i=1}^n x_i^2 = \left(y_1 - \sum_{i=2}^n y_i\right)^2 + \sum_{i=2}^n (y_i + y_1)^2\]
Expanding and simplifying: \[= ny_1^2 + \sum_{i=2}^n y_i^2 + \left(\sum_{i=2}^n y_i\right)^2\]
Step 5: Observe factorization
\[f(y_1, \ldots, y_n) = \frac{n}{(2\pi)^{n/2}} \exp\left(-\frac{ny_1^2}{2}\right) \exp\left(-\frac{1}{2}\left[\sum_{i=2}^n y_i^2 + \left(\sum_{i=2}^n y_i\right)^2\right]\right)\]
This factors as:
\[= \underbrace{\sqrt{\frac{n}{2\pi}} e^{-ny_1^2/2}}_{\text{density of } Y_1 = \bar{X}} \times \underbrace{g(y_2, \ldots, y_n)}_{\text{joint density of } Y_2, \ldots, Y_n}\]
Since joint pdf factors: \(Y_1 = \bar{X}\) is independent of \((Y_2, \ldots, Y_n)\) .
Therefore \(\bar{X}\) is independent of \(S^2\) . \(\blacksquare\)
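An empirical illustration of part (a), with a skewed population for contrast (a sketch assuming NumPy; the exponential is an arbitrary non-normal choice):

```python
import numpy as np

# Under normality the sample correlation of X-bar and S^2 is near zero;
# for a skewed population it is clearly nonzero, so independence fails.
rng = np.random.default_rng(2)
reps, n = 100_000, 10

x = rng.normal(size=(reps, n))
corr_normal = np.corrcoef(x.mean(axis=1), x.var(axis=1, ddof=1))[0, 1]

y = rng.exponential(size=(reps, n))
corr_expo = np.corrcoef(y.mean(axis=1), y.var(axis=1, ddof=1))[0, 1]

print(corr_normal)  # near zero under normality
print(corr_expo)    # clearly positive for the skewed population
```

Independence here is a genuinely normal phenomenon: for the exponential, \(\text{Cov}(\bar{X}, S^2) = \mu_3/n \neq 0\) because the third central moment is nonzero.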
PROOF of Chi-Squared Distribution (Part c)
Goal: Show \((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\) .
Strategy: Induction on \(n\) .
Notation: Let \(\bar{X}_k\) and \(S_k^2\) denote sample mean and variance based on first \(k\) observations.
Key Identity: \[(n-1)S_n^2 = (n-2)S_{n-1}^2 + \frac{n-1}{n}(X_n - \bar{X}_{n-1})^2\]
Base case (\(n=2\) ):
\[S_2^2 = \frac{1}{1}\left[(X_1 - \bar{X}_2)^2 + (X_2 - \bar{X}_2)^2\right] = \frac{(X_1 - X_2)^2}{2}\]
So \((n-1)S_2^2/\sigma^2 = (X_1 - X_2)^2/(2\sigma^2)\) .
Since \(X_1 - X_2 \sim N(0, 2\sigma^2)\) , we have: \[(X_1 - X_2)/(\sigma\sqrt{2}) \sim N(0,1)\]
Therefore: \[S_2^2/\sigma^2 = [(X_1-X_2)/(\sigma\sqrt{2})]^2 \sim \chi^2_1 \quad \checkmark\]
Inductive step:
Assume \((n-2)S_{n-1}^2/\sigma^2 \sim \chi^2_{n-2}\) .
From the key identity: \[\frac{(n-1)S_n^2}{\sigma^2} = \frac{(n-2)S_{n-1}^2}{\sigma^2} + \frac{(X_n - \bar{X}_{n-1})^2}{\sigma^2 \cdot n/(n-1)}\]
Now \(X_n\) is independent of \(X_1, \ldots, X_{n-1}\), hence of \(\bar{X}_{n-1}\) and \(S_{n-1}^2\). Moreover, part (a) applied to the first \(n-1\) observations gives \(\bar{X}_{n-1} \perp S_{n-1}^2\), so \((X_n - \bar{X}_{n-1})^2\) is independent of \(S_{n-1}^2\): the two terms on the right are independent.
Computing the distribution of the second term:
\[X_n - \bar{X}_{n-1} \sim N\left(0, \sigma^2 + \frac{\sigma^2}{n-1}\right) = N\left(0, \frac{\sigma^2 n}{n-1}\right)\]
So: \[\frac{(X_n - \bar{X}_{n-1})^2}{\sigma^2 n/(n-1)} \sim \chi^2_1\]
By independence of the two terms and the additive property of chi-squared: \[\frac{(n-1)S_n^2}{\sigma^2} \sim \chi^2_{(n-2)+1} = \chi^2_{n-1}\]
\(\blacksquare\)
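The conclusion can be sanity-checked by matching moments of the simulated pivot to \(\chi^2_{n-1}\) (a sketch, NumPy assumed):

```python
import numpy as np

# Simulate (n-1)S^2/sigma^2 and compare its mean and variance with
# chi^2_{n-1}, which has mean n-1 and variance 2(n-1).
rng = np.random.default_rng(3)
n, sigma, reps = 8, 2.0, 100_000
x = rng.normal(0.0, sigma, size=(reps, n))
q = (n - 1) * x.var(axis=1, ddof=1) / sigma**2

print(q.mean())  # close to n - 1 = 7
print(q.var())   # close to 2(n - 1) = 14
```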
Student’s \(t\) Distribution — Definition and Derivation
Motivation: When \(\sigma\) is unknown, replace it with \(S\) :
\[\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \quad \text{but } \sigma \text{ unknown}\]
\[\frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{V/(n-1)}}\]
where \(Z \sim N(0,1)\) , \(V = (n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\) , and \(Z \perp V\) .
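Numerically, the studentized mean really does have \(t_{n-1}\) tails rather than normal tails. A sketch (SciPy assumed; 2.776 is the \(t_4\) 97.5% quantile):

```python
import numpy as np
from scipy import stats

# Tail frequency of the studentized mean for n = 5 versus the t_4 tail.
rng = np.random.default_rng(4)
n, mu, sigma, reps = 5, 10.0, 3.0, 400_000
x = rng.normal(mu, sigma, size=(reps, n))
t_stat = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

emp = (np.abs(t_stat) > 2.776).mean()
print(emp)                          # close to 0.05
print(2 * stats.t.sf(2.776, df=4))  # the t_4 two-sided tail, about 0.05
print(2 * stats.norm.sf(2.776))     # the normal tail is much smaller
```

Using normal critical values here would understate the tail probability, which is the practical reason the \(t\) distribution exists.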
DERIVATION of \(t\) Distribution PDF
Goal: Derive the PDF of \(T = Z/\sqrt{V/p}\) where \(Z \sim N(0,1)\) , \(V \sim \chi^2_p\) , \(Z \perp V\) .
Step 1: Joint PDF of \((Z, V)\)
Since \(Z\) and \(V\) are independent: \[f_{Z,V}(z, v) = f_Z(z) \cdot f_V(v)\]
\[= \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \cdot \frac{1}{\Gamma(p/2) 2^{p/2}} v^{(p/2)-1} e^{-v/2}\]
for \(-\infty < z < \infty\) and \(v > 0\) .
Step 2: Transformation
Define: \[t = \frac{z}{\sqrt{v/p}}, \quad w = v\]
This gives us: \[z = t\sqrt{w/p}, \quad v = w\]
Step 3: Compute the Jacobian
\[J = \begin{vmatrix} \frac{\partial z}{\partial t} & \frac{\partial z}{\partial w} \\ \frac{\partial v}{\partial t} & \frac{\partial v}{\partial w} \end{vmatrix} = \begin{vmatrix} \sqrt{w/p} & \frac{t}{2\sqrt{pw}} \\ 0 & 1 \end{vmatrix} = \sqrt{\frac{w}{p}}\]
Therefore: \(|J| = \sqrt{w/p}\)
Step 4: Joint PDF of \((T, W)\)
\[f_{T,W}(t, w) = f_{Z,V}(z(t,w), v(t,w)) \cdot |J|\]
\[= \frac{1}{\sqrt{2\pi}} e^{-t^2w/(2p)} \cdot \frac{1}{\Gamma(p/2) 2^{p/2}} w^{(p/2)-1} e^{-w/2} \cdot \sqrt{\frac{w}{p}}\]
\[= \frac{1}{\sqrt{2\pi p} \, \Gamma(p/2) 2^{p/2}} w^{(p-1)/2} e^{-w(1 + t^2/p)/2}\]
Step 5: Marginal PDF of \(T\) (integrate out \(W\) )
\[f_T(t) = \int_0^\infty f_{T,W}(t, w) \, dw\]
\[= \frac{1}{\sqrt{2\pi p} \, \Gamma(p/2) 2^{p/2}} \int_0^\infty w^{(p-1)/2} e^{-w(1 + t^2/p)/2} \, dw\]
Step 6: Recognize the Gamma kernel
The integrand is the kernel of a \(\text{Gamma}\left(\frac{p+1}{2}, \frac{2}{1 + t^2/p}\right)\) density (shape \(\frac{p+1}{2}\), scale \(\frac{2}{1 + t^2/p}\)).
Recall that for Gamma\((\alpha, \beta)\) : \[\int_0^\infty \frac{1}{\Gamma(\alpha)\beta^\alpha} x^{\alpha-1} e^{-x/\beta} \, dx = 1\]
So: \[\int_0^\infty w^{(p-1)/2} e^{-w(1 + t^2/p)/2} \, dw = \Gamma\left(\frac{p+1}{2}\right) \left(\frac{2}{1 + t^2/p}\right)^{(p+1)/2}\]
Step 7: Substitute back
\[f_T(t) = \frac{1}{\sqrt{2\pi p} \, \Gamma(p/2) 2^{p/2}} \cdot \Gamma\left(\frac{p+1}{2}\right) \left(\frac{2}{1 + t^2/p}\right)^{(p+1)/2}\]
\[= \frac{\Gamma\left(\frac{p+1}{2}\right)}{\Gamma(p/2) \sqrt{2\pi p} \, 2^{p/2}} \cdot \frac{2^{(p+1)/2}}{(1 + t^2/p)^{(p+1)/2}}\]
\[= \frac{\Gamma\left(\frac{p+1}{2}\right)}{\Gamma(p/2) \sqrt{p\pi}} \left(1 + \frac{t^2}{p}\right)^{-(p+1)/2}\]
for \(-\infty < t < \infty\) . \(\blacksquare\)
This derivation illustrates the power of the transformation method. The key insights are:
1. Using independence to write the joint PDF
2. Choosing a convenient transformation (keeping \(w = v\) simplifies the Jacobian)
3. Recognizing the gamma kernel in the integral
The symmetry of the \(t\) distribution is evident from the even power of \(t\) in the denominator.
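The final formula can be cross-checked against a reference implementation (a sketch assuming SciPy):

```python
import numpy as np
from scipy import stats
from math import gamma, pi, sqrt

# The derived density, transcribed directly from the final line of the proof:
# f_T(t) = Gamma((p+1)/2) / (Gamma(p/2) sqrt(p*pi)) * (1 + t^2/p)^{-(p+1)/2}
def t_pdf(t, p):
    return gamma((p + 1) / 2) / (gamma(p / 2) * sqrt(p * pi)) \
        * (1 + t * t / p) ** (-(p + 1) / 2)

ts = np.linspace(-4, 4, 9)
for p in (1, 3, 10):
    assert np.allclose([t_pdf(t, p) for t in ts], stats.t.pdf(ts, df=p))
print("derived pdf matches scipy.stats.t.pdf")
```

Note that `t_pdf(0.0, 1)` equals \(1/\pi\), the Cauchy density at 0, consistent with \(t_1\) being Cauchy.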
Definition (Student’s \(t\) )
\(T \sim t_p\) if \(T = Z/\sqrt{V/p}\) where \(Z \sim N(0,1)\) , \(V \sim \chi^2_p\) , \(Z \perp V\) .
PDF: \[f_T(t) = \frac{\Gamma\left(\frac{p+1}{2}\right)}{\Gamma\left(\frac{p}{2}\right)\sqrt{p\pi}} \left(1 + \frac{t^2}{p}\right)^{-(p+1)/2}\]
Note: \(t_1 = \text{Cauchy}\) (no mean!)
Properties of \(t\) Distribution
\(E[T_p] = 0\) if \(p > 1\)
\(\text{Var}(T_p) = p/(p-2)\) if \(p > 2\)
As \(p \to \infty\) : \(t_p \to N(0,1)\)
Symmetric about 0
Heavier tails than normal
The heavier tails account for the additional uncertainty from estimating \(\sigma\) with \(S\) . As \(n \to \infty\) , \(S \to \sigma\) and the \(t\) approaches the normal.
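Two of these properties, checked against a reference implementation (a sketch assuming SciPy; the cutoff 3 is an arbitrary choice):

```python
from scipy import stats

# Var(T_p) = p/(p-2) for p > 2, and t tails are heavier than normal tails.
for p in (3, 5, 30):
    assert abs(stats.t(df=p).var() - p / (p - 2)) < 1e-9

print(stats.t.sf(3, df=3))  # tail mass beyond 3 for t_3
print(stats.norm.sf(3))     # much smaller for N(0,1)
```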
Snedecor’s \(F\) Distribution — Definition
Motivation: Compare variances from two populations
If \(X_1, \ldots, X_n \sim N(\mu_X, \sigma_X^2)\) and \(Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma_Y^2)\) independent samples:
\[F = \frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} = \frac{\chi^2_{n-1}/(n-1)}{\chi^2_{m-1}/(m-1)} \sim F_{n-1, m-1}\]
Definition (Snedecor’s \(F\) )
\(F \sim F_{p,q}\) if \(F = (U/p)/(V/q)\) where \(U \sim \chi^2_p\) , \(V \sim \chi^2_q\) , \(U \perp V\) .
PDF: \[f_F(x) = \frac{\Gamma\left(\frac{p+q}{2}\right)}{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{q}{2}\right)} \left(\frac{p}{q}\right)^{p/2} \frac{x^{(p/2)-1}}{\left(1 + \frac{p}{q}x\right)^{(p+q)/2}}\]
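A quick simulation of the variance-ratio construction (a sketch, NumPy/SciPy assumed; sample sizes arbitrary, with \(\sigma_X = \sigma_Y = 1\) so the ratio is exactly \(F_{n-1,m-1}\)):

```python
import numpy as np
from scipy import stats

# Ratio of sample variances from two independent N(0,1) samples should
# follow F_{n-1, m-1}; compare the simulated mean to q/(q-2), q = m-1.
rng = np.random.default_rng(5)
n, m, reps = 6, 12, 200_000
x = rng.normal(size=(reps, n))
y = rng.normal(size=(reps, m))
f = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)

print(f.mean())                              # close to 11/9
print(stats.f(dfn=n - 1, dfd=m - 1).mean())  # q/(q-2) = 11/9
```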
Relationships Between \(\chi^2\) , \(t\) , \(F\)
(a) If \(X \sim F_{p,q}\) , then \(1/X \sim F_{q,p}\)
(b) If \(T \sim t_q\) , then \(T^2 \sim F_{1,q}\)
(c) If \(X \sim F_{p,q}\) , then \(\frac{(p/q)X}{1 + (p/q)X} \sim \text{Beta}(p/2, q/2)\)
Part (b) is particularly useful — it connects \(t\) -tests to \(F\) -tests. A two-sided \(t\) -test is equivalent to an \(F\) -test with 1 numerator df.
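Relationship (b) at the level of quantiles (a sketch assuming SciPy): the squared two-sided \(t\) critical value is the one-sided \(F_{1,q}\) critical value.

```python
from scipy import stats

# P(|T| > c) = P(T^2 > c^2) = P(F_{1,q} > c^2), so the 97.5% t quantile
# squared equals the 95% F_{1,q} quantile.
for q in (4, 10, 25):
    t_crit = stats.t.ppf(0.975, df=q)
    f_crit = stats.f.ppf(0.95, dfn=1, dfd=q)
    assert abs(t_crit**2 - f_crit) < 1e-6
print("squared t quantiles match F_{1,q} quantiles")
```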
Summary Diagram
\[N(0,1) \xrightarrow{\text{square}} \chi^2_1 \xrightarrow{\text{sum of } p} \chi^2_p\]
Derived distributions:
\[t_p = \frac{Z}{\sqrt{\chi^2_p / p}} \qquad \text{where } Z \perp \chi^2_p\]
\[F_{p,q} = \frac{\chi^2_p / p}{\chi^2_q / q} \qquad \text{where } \chi^2_p \perp \chi^2_q\]
Key results:
\(\bar{X} \sim N(\mu, \sigma^2/n)\)
\((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\)
\(\bar{X} \perp S^2\) (under normality)
\((\bar{X} - \mu)/(S/\sqrt{n}) \sim t_{n-1}\)